Mann-Whitney U Test
The Mann-Whitney U test, also known as the Wilcoxon rank-sum test, is a non-parametric test for comparing two independent groups when normality of the data cannot be assumed. It is often used as an alternative to the independent-samples t-test when the data are not normally distributed.
Purpose: The Mann-Whitney U Test is used to compare differences between two independent groups when the dependent variable is either ordinal or continuous, but not normally distributed. It is the non-parametric alternative to the independent two-sample t-test.
How it Works:
The test works by ranking all the values from both groups together. The ranks are then used to calculate the U statistic (a measure of the number of times a score from one group precedes a score from the other group).
The test essentially assesses whether one group tends to have higher or lower values than the other, without assuming a specific distribution of the scores.
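The pair-counting idea behind U can be sketched directly. This is a minimal illustration, not a library routine: the function name and the scores are made up, and it assumes the convention in which a group's U counts the pairs where its score is the larger one, with each tie counted as half a pair.

```python
def u_by_pair_counting(x, y):
    """Count the pairs (xi, yj) in which xi exceeds yj; a tie counts as half."""
    u = 0.0
    for xi in x:
        for yj in y:
            if xi > yj:
                u += 1.0
            elif xi == yj:
                u += 0.5
    return u

# Hypothetical scores, made up for illustration
group_x = [7, 9, 12, 15]
group_y = [5, 8, 9, 10]

u_x = u_by_pair_counting(group_x, group_y)
u_y = u_by_pair_counting(group_y, group_x)
print("U for x:", u_x, "U for y:", u_y)
```

Every pair is counted exactly once across the two statistics, so the two U values always add up to the product of the group sizes (here 4 × 4 = 16).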
Assumptions
The Mann-Whitney U test is based on the following assumptions:
- Independence of Samples: The samples from the two groups must be independent of each other.
- Ordinal Data: The dependent variable should be at least ordinal (ordinal or continuous); it does not need to be normally distributed.
- Similarity of Shape: The distributions of the two groups should have the same shape, differing at most in location. This is what allows the result to be read as a comparison of medians.
Hypotheses
The hypotheses for the Mann-Whitney U test are framed as follows:
- Null Hypothesis (H₀): There is no difference in the medians of the two groups.
- Alternative Hypothesis (H₁): There is a difference in the medians of the two groups.
Calculation Steps
- Combine all observations from both groups into a single dataset.
- Rank all observations from the lowest to the highest, handling ties by assigning to each tied value the average of the ranks they would have otherwise occupied.
- Calculate the sum of ranks for each group.
- Use the sum of ranks to compute the U statistic for each group. One common form is U = R - n(n + 1)/2, where R is the group's rank sum and n is the group's size.
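The steps above can be sketched in plain Python. The helper names and the scores are hypothetical, and the last step assumes the convention U = R - n(n + 1)/2, which is only one of the forms found in textbooks (the other form yields the complementary U).

```python
def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of the ranks they span."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to the end of the run of values tied with position i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1  # mean of the 1-based ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks

def u_from_rank_sums(x, y):
    n1, n2 = len(x), len(y)
    ranks = average_ranks(list(x) + list(y))  # combine, then rank with ties averaged
    r1 = sum(ranks[:n1])                      # rank sum of the first group
    r2 = sum(ranks[n1:])                      # rank sum of the second group
    u1 = r1 - n1 * (n1 + 1) / 2               # assumed convention for U
    u2 = r2 - n2 * (n2 + 1) / 2
    return r1, r2, u1, u2

# Hypothetical scores, made up for illustration
group_x = [7, 9, 12, 15]
group_y = [5, 8, 9, 10]

r1, r2, u1, u2 = u_from_rank_sums(group_x, group_y)
print("Rank sums:", r1, r2)
print("U values:", u1, u2, "smaller U:", min(u1, u2))
```

The two 9s occupy ranks 4 and 5 in the combined data, so each receives rank 4.5. Under this convention the two U values sum to the product of the group sizes, which is a quick sanity check on the arithmetic.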
Interpretation
The smaller of the two U values is used as the test statistic. It is compared to a critical value from a Mann-Whitney U table or, for large samples, converted to a p-value using a normal approximation. If the calculated U is less than or equal to the critical value, or if the p-value is less than the chosen alpha level, the null hypothesis is rejected, indicating a significant difference between the groups. Statistical software typically reports the U of the first sample, which is not necessarily the smaller one; the two-sided p-value is the same either way.
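For small samples without ties, the table values come from the exact null distribution of U, which can be enumerated by brute force. The sketch below is illustrative only: the function name is hypothetical, it assumes no tied observations, and it assumes U is computed as the rank sum minus n(n + 1)/2. Under the null hypothesis every way of assigning ranks to the first group is equally likely.

```python
from itertools import combinations

def exact_two_sided_p(u_observed, n1, n2):
    """Exact two-sided p-value for U by enumeration, assuming no ties."""
    all_ranks = range(1, n1 + n2 + 1)
    mean_u = n1 * n2 / 2
    extreme = 0
    total = 0
    # Each choice of n1 ranks for the first group is equally likely under H0
    for ranks_1 in combinations(all_ranks, n1):
        u = sum(ranks_1) - n1 * (n1 + 1) / 2
        total += 1
        # Count outcomes at least as far from the mean as the observed U
        if abs(u - mean_u) >= abs(u_observed - mean_u):
            extreme += 1
    return extreme / total

# Hypothetical case: two groups of 4 with complete separation (U = 0)
p_exact = exact_two_sided_p(0, 4, 4)
print("Exact two-sided p:", round(p_exact, 4))
```

Enumeration grows quickly with sample size, which is why tables are limited to small samples and software switches to a normal approximation for larger ones or when ties are present.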
Example Problem
Consider two groups of patients treated with different methods to reduce symptoms. Each group has 6 patients. Their scores are:
- Group A: 120, 101, 130, 115, 100, 130
- Group B: 85, 90, 110, 115, 120, 125
Hypotheses:
- Null Hypothesis (H₀): The median symptom reduction is equal between both treatments.
- Alternative Hypothesis (H₁): The median symptom reduction differs between the treatments.
Mann-Whitney U Test using R:
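NOTE:
- In R, the Mann-Whitney U test is run with the wilcox.test function, which labels it the Wilcoxon rank sum test when applied to two independent samples. The two names refer to the same test. The W in the output is the U statistic of the first sample. Because these data contain ties, R cannot compute an exact p-value and falls back on the normal approximation with continuity correction, as the output header indicates.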
Code
# Scores for two groups
group_a <- c(120, 101, 130, 115, 100, 130)
group_b <- c(85, 90, 110, 115, 120, 125)
# Perform Mann-Whitney U test
mw_test <- wilcox.test(group_a, group_b)
# Print the results
print(mw_test)
Wilcoxon rank sum test with continuity correction
data: group_a and group_b
W = 24, p-value = 0.376
alternative hypothesis: true location shift is not equal to 0
Mann-Whitney U Test using Python:
Code
from scipy.stats import mannwhitneyu
# Scores for two groups
group_a = [120, 101, 130, 115, 100, 130]
group_b = [85, 90, 110, 115, 120, 125]
# Perform Mann-Whitney U test
u_statistic, p_value = mannwhitneyu(group_a, group_b, alternative='two-sided')
# Print the results
print("U Statistic:", u_statistic, "P Value:", p_value)
U Statistic: 24.0 P Value: 0.3759621824832893
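The reported values can be cross-checked by hand. The sketch below assumes the standard normal approximation with a tie correction in the variance and a continuity correction of 0.5, which is what the "with continuity correction" header in the R output suggests; the function name is hypothetical.

```python
import math
from collections import Counter

def normal_approx_p(u, x, y):
    """Two-sided p-value for U via the normal approximation (tie-corrected)."""
    n1, n2 = len(x), len(y)
    n = n1 + n2
    # Each group of t tied values contributes t^3 - t to the tie correction
    tie_term = sum(t**3 - t for t in Counter(list(x) + list(y)).values())
    mean_u = n1 * n2 / 2
    var_u = n1 * n2 / 12 * ((n + 1) - tie_term / (n * (n - 1)))
    # Continuity correction: shrink the distance from the mean by 0.5
    z = max(abs(u - mean_u) - 0.5, 0.0) / math.sqrt(var_u)
    p = math.erfc(z / math.sqrt(2))  # equals 2 * (1 - Phi(z))
    return z, p

group_a = [120, 101, 130, 115, 100, 130]
group_b = [85, 90, 110, 115, 120, 125]

z, p = normal_approx_p(24, group_a, group_b)
print("z:", round(z, 3), "p:", round(p, 3))  # p rounds to 0.376
```

Here the mean of U is 18 and three pairs of tied scores (115, 120, 130) reduce the variance slightly, giving z of about 0.885 and the same p-value of roughly 0.376 that R and SciPy report.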
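Both R and Python give a statistic of 24 and a p-value of about 0.376. At an alpha level of 0.05, for example, this p-value is too large to reject the null hypothesis, so these data do not show a significant difference between the two treatments. More generally, the test allows researchers and analysts to assess the evidence against the null hypothesis in a manner that is robust to non-normal data distributions.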